Open Data

Open data is data that is openly accessible, exploitable, editable and shared by anyone for any purpose. Open data is licensed under an open license. The goals of the open data movement are similar to those of other "open(-source)" movements such as open-source software, hardware, open content, open specifications, open education, open educational resources, open government, open knowledge, open access, open science, and the open web. The growth of the open data movement is paralleled by a rise in intellectual property rights. The philosophy behind open data has been long established (for example in the Mertonian tradition of science), but the term "open data" itself is recent, gaining popularity with the rise of the Internet and World Wide Web and, especially, with the launch of open-data government initiatives such as Data.gov, Data.gov.uk and Data.gov.in. Open data can be
linked data, in which case it is referred to as linked open data. One of the most important forms of open data is open government data (OGD), which is open data created by government institutions. The importance of open government data stems from its role in citizens' everyday lives, down to the most routine tasks that seem far removed from government. The abbreviation FAIR/O data is sometimes used to indicate that a dataset or database complies with the principles of FAIR data and carries an explicit data-capable open license.


Overview

The concept of open data is not new, but a formalized definition is relatively recent. As a phenomenon, open data denotes that governmental data should be available to anyone, with the possibility of redistribution in any form and without any copyright restriction. Another definition is the Open Definition, which can be summarized as "a piece of data is open if anyone is free to use, reuse, and redistribute it – subject only, at most, to the requirement to attribute and/or share-alike." Other definitions, including the Open Data Institute's "open data is data that anyone can access, use or share," offer an accessible short version but refer back to the formal definition. Open data may include non-textual material such as maps, genomes, connectomes, chemical compounds, mathematical and scientific formulae, medical data and practice, bioscience, and biodiversity.

A major barrier to the open data movement is the commercial value of data. Access to, or re-use of, data is often controlled by public or private organizations. Control may be exercised through access restrictions, licenses, copyright, patents and charges for access or re-use. Advocates of open data argue that these restrictions detract from the common good and that data should be available without restrictions or fees.

Creators of data often do not consider the need to state the conditions of ownership, licensing and re-use, presuming instead that not asserting copyright places the data in the public domain. For example, many scientists do not consider the data published with their work to be theirs to control, and consider the act of publication in a journal to be an implicit release of data into the commons. The lack of a license makes it difficult to determine the status of a data set and may restrict the use of data offered in an "Open" spirit. Because of this uncertainty, public or private organizations can aggregate such data, claim that it is protected by copyright, and then resell it.

Indigenous knowledge (IK) poses a great challenge in terms of capture, storage and distribution. Many societies in developing countries lack the technical processes for managing IK.

At his presentation at the XML 2005 conference, Connolly displayed these two quotations regarding open data:
* "I want my data back." (Jon Bosak circa 1997)
* "I've long believed that customers of any application own the data they enter into it." (This quote refers to Veen's own heart-rate data.)


Major sources

Open data can come from any source. This section lists some of the fields that publish (or at least discuss publishing) a large amount of open data.


In science

The concept of open access to scientific data was established with the formation of the World Data Center system, in preparation for the International Geophysical Year of 1957–1958. The International Council of Scientific Unions (now the International Council for Science) oversees several World Data Centres with the mission to minimize the risk of data loss and to maximize data accessibility.

While the open-science-data movement long predates the Internet, the availability of fast, readily available networking has significantly changed the context of open science data, as publishing or obtaining data has become much less expensive and time-consuming.

The Human Genome Project was a major initiative that exemplified the power of open data. It was built upon the so-called Bermuda Principles, stipulating that: "All human genomic sequence information … should be freely available and in the public domain in order to encourage research and development and to maximize its benefit to society". More recent initiatives such as the Structural Genomics Consortium have illustrated that the open data approach can be used productively within the context of industrial R&D.

In 2004, the Science Ministers of all nations of the Organisation for Economic Co-operation and Development (OECD), which includes most developed countries of the world, signed a declaration which states that all publicly funded archive data should be made publicly available. Following a request and an intense discussion with data-producing institutions in member states, the OECD published in 2007 the OECD Principles and Guidelines for Access to Research Data from Public Funding as a "soft-law" recommendation.

Examples of open data in science:
* data.uni-muenster.de – Open data about scientific artifacts from the University of Muenster, Germany. Launched in 2011.
* Dataverse Network Project – archival repository software promoting data sharing, persistent data citation, and reproducible research.
* linkedscience.org/data – Open scientific datasets encoded as Linked Data. Launched in 2011, ended 2018.
* systemanaturae.org – Open scientific datasets related to wildlife classified by animal species. Launched in 2015.


In government

There is a range of different arguments for government open data. Some advocates say that making government information available to the public as machine-readable open data can facilitate government transparency, accountability and public participation. "Open data can be a powerful force for public accountability – it can make existing information easier to analyze, process, and combine than ever before, allowing a new level of public scrutiny." Governments that enable public viewing of data can help citizens engage within the governmental sectors and "add value to that data."

Open data experts have offered a more nuanced view of the impact that opening government data may have on government transparency and accountability. In a widely cited paper, scholars David Robinson and Harlan Yu contend that governments may project a veneer of transparency by publishing machine-readable data that does not actually make government more transparent or accountable. Drawing from earlier studies on transparency and anticorruption, World Bank political scientist Tiago C. Peixoto extended Yu and Robinson's argument by highlighting a minimal chain of events necessary for open data to lead to accountability:
# relevant data is disclosed;
# the data is widely disseminated and understood by the public;
# the public reacts to the content of the data; and
# public officials either respond to the public's reaction or are sanctioned by the public through institutional means.

Some make the case that opening up official information can support technological innovation and economic growth by enabling third parties to develop new kinds of digital applications and services.

Several national governments have created websites to distribute a portion of the data they collect. Open government data has also been described as a concept for a collaborative project within municipal government to create and organize a culture of open data. Additionally, other levels of government have established open data websites.
There are many government entities pursuing open data in Canada. Data.gov lists the sites of a total of 40 US states and 46 US cities and counties with websites to provide open data, e.g., the state of Maryland, the state of California, and New York City.

At the international level, the United Nations has an open data website that publishes statistical data from member states and UN agencies, and the World Bank publishes a range of statistical data relating to developing countries. The European Commission has created two portals for the European Union: the EU Open Data Portal, which gives access to open data from the EU institutions, agencies and other bodies, and the European Data Portal, which provides datasets from local, regional and national public bodies across Europe. The two portals were consolidated into data.europa.eu on April 21, 2021.

Italy is the first country to release standard processes and guidelines under a Creative Commons license for widespread use in public administration. The open model is called the Open Data Management Cycle and was adopted in several regions such as Veneto and Umbria. Major cities such as Reggio Calabria and Genova have also adopted this model.

In October 2015, the Open Government Partnership launched the International Open Data Charter, a set of principles and best practices for the release of governmental open data, formally adopted by seventeen governments of countries, states and cities during the OGP Global Summit in Mexico.


In non-profit organizations

Many non-profit organizations offer open access to their data, as long as it does not undermine the privacy rights of their users, members or third parties. In comparison to for-profit corporations, they do not seek to monetize their data. OpenNWT launched a website offering open data of elections. CIAT offers open data to anybody who is willing to conduct big data analytics in order to enhance the benefit of international agricultural research. DBLP, which is owned by the non-profit organization Dagstuhl, offers its database of scientific publications from computer science as open data. Open Icecat provides product data-sheets and e-commerce usage statistics as open data.

Non-profit hospitality exchange services offer trustworthy teams of scientists access to their anonymized data for publication of insights to the benefit of humanity. Before becoming a for-profit corporation in 2011, Couchsurfing offered four research teams access to its social networking data. In 2015, the non-profit hospitality exchange services Bewelcome and Warm Showers provided their data for public research.


Policies and strategies

At the level of an individual business or research organization, policies and strategies towards open data vary, sometimes greatly. One common strategy is the use of a data commons. A data commons is an interoperable software and hardware platform that aggregates (or collocates) data, data infrastructure, and data-producing and data-managing applications in order to better allow a community of users to manage, analyze, and share their data with others over both short- and long-term timelines. Ideally, this interoperable cyberinfrastructure should be robust enough "to facilitate transitions between stages in the life cycle of a collection" of data and information resources while still being driven by common data models and workspace tools enabling and supporting robust data analysis. The policies and strategies underlying a data commons will ideally involve numerous stakeholders, including the data commons service provider, data contributors, and data users.

Grossman et al. suggest six major considerations for a data commons strategy that better enables open data in businesses and research organizations. Such a strategy should address the need for:
* permanent, persistent digital IDs, which enable access controls for datasets;
* permanent, discoverable metadata associated with each digital ID;
* application programming interface (API)-based access, tied to an authentication and authorization service;
* data portability;
* data "peering," without access, egress, and ingress charges; and
* a rationed approach to users computing data over the data commons.

Beyond individual businesses and research centers, and at a more macro level, countries like Germany have launched their own official nationwide open data strategies, detailing how data management systems and data commons should be developed, used, and maintained for the greater public good.
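
Several of the considerations listed above can be illustrated with a small in-memory sketch. This is not taken from Grossman et al. or from any real data commons platform: the class and method names (`DataCommons`, `register`, `issue_token`, `search`, `fetch`, `export`) are hypothetical, UUIDs stand in for whatever persistent identifier scheme a real deployment would use, and a set of random tokens stands in for a proper authentication and authorization service. Peering and rationed compute are left out.

```python
import json
import secrets
import uuid
from dataclasses import dataclass


@dataclass
class Dataset:
    """One entry in a toy data commons (hypothetical structure)."""
    persistent_id: str   # permanent digital ID for the dataset
    metadata: dict       # discoverable metadata attached to that ID
    records: list        # the data itself
    public: bool = True  # whether reads are allowed without a token


class DataCommons:
    """Minimal sketch: IDs, metadata search, token-gated access, export."""

    def __init__(self):
        self._datasets = {}
        self._tokens = set()

    def register(self, metadata, records, public=True):
        # Assumption: a UUID is used as a stand-in persistent identifier.
        pid = f"dc:{uuid.uuid4()}"
        self._datasets[pid] = Dataset(pid, dict(metadata), list(records), public)
        return pid

    def issue_token(self):
        # Stand-in for a real authentication and authorization service.
        token = secrets.token_hex(16)
        self._tokens.add(token)
        return token

    def search(self, **criteria):
        # Return the IDs of datasets whose metadata matches every criterion.
        return [
            d.persistent_id
            for d in self._datasets.values()
            if all(d.metadata.get(k) == v for k, v in criteria.items())
        ]

    def fetch(self, pid, token=None):
        # API-style access: non-public datasets require a known token.
        dataset = self._datasets[pid]
        if not dataset.public and token not in self._tokens:
            raise PermissionError(f"token required for {pid}")
        return dataset

    def export(self, pid, token=None):
        # Data portability: serialize the dataset to plain JSON.
        d = self.fetch(pid, token)
        return json.dumps(
            {"id": d.persistent_id, "metadata": d.metadata, "records": d.records},
            sort_keys=True,
        )


if __name__ == "__main__":
    commons = DataCommons()
    pid = commons.register(
        {"title": "Air quality", "license": "CC-BY-4.0"}, [{"pm25": 12}]
    )
    print(commons.search(license="CC-BY-4.0") == [pid])
    print(commons.export(pid))
```

Keeping the license in the metadata, as in the usage example, also addresses the problem raised in the Overview: a dataset without a stated license has an unclear re-use status.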


Arguments for and against

Opening government data is only a waypoint on the road to improving education, improving government, and building tools to solve other real-world problems. While many arguments have been made categorically, the following discussion of arguments for and against open data highlights that these arguments often depend highly on the type of data and its potential uses.

Arguments made on behalf of open data include the following:
* "Data belongs to the human race". Typical examples are genomes, data on organisms, medical science, environmental data following the Aarhus Convention.
* Public money was used to fund the work, and so it should be universally available.
* It was created by or at a government institution (this is common in US National Laboratories and government agencies).
* Facts cannot legally be copyrighted.
* Sponsors of research do not get full value unless the resulting data are freely available.
* Restrictions on data re-use create an anticommons.
* Data are required for the smooth process of running communal human activities and are an important enabler of socio-economic development (health care, education, economic productivity, etc.). (Martin Hilbert (2013), "Big Data for Development: From Information- to Knowledge Societies", SSRN Scholarly Paper No. ID 2205145. Rochester, NY: Social Science Research Network; https://ssrn.com/abstract=2205145)
* In scientific research, the rate of discovery is accelerated by better access to data.
* Making data open helps combat "data rot" and ensure that scientific research data are preserved over time.
* Statistical literacy benefits from open data. Instructors can use locally relevant data sets to teach statistical concepts to their students.

It is generally held that factual data cannot be copyrighted. Publishers frequently add copyright statements (often forbidding re-use) to scientific data accompanying publications. It may be unclear whether the factual data embedded in full text are part of the copyright. While the human abstraction of facts from paper publications is normally accepted as legal, there is often an implied restriction on machine extraction by robots.

Unlike open access, where groups of publishers have stated their concerns, open data is normally challenged by individual institutions. Their arguments have been discussed less in public discourse and there are fewer quotes to rely on at this time.

Arguments against making all data available as open data include the following:
* Government funding may not be used to duplicate or challenge the activities of the private sector (e.g. PubChem).
* Governments have to be accountable for the efficient use of taxpayers' money: if public funds are used to aggregate the data and if the data will bring commercial (private) benefits to only a small number of users, the users should reimburse governments for the cost of providing the data.
* Open data may lead to exploitation of, and rapid publication of results based on, data pertaining to developing countries by rich and well-equipped research institutes, without any further involvement of and/or benefit to local communities (helicopter research). This is similar to the historical open access to tropical forests that has led to the misappropriation ("Global Pillage") of plant genetic resources from developing countries.
* The revenue earned by publishing data can be used to cover the costs of generating and/or disseminating the data, so that the dissemination can continue indefinitely.
* The revenue earned by publishing data permits non-profit organizations to fund other activities (e.g. learned society publishing supports the society).
* The government gives specific legitimacy for certain organizations to recover costs (NIST in the US, Ordnance Survey in the UK).
* Privacy concerns may require that access to data is limited to specific users or to sub-sets of the data.
* Collecting, "cleaning", managing and disseminating data are typically labour- and/or cost-intensive processes; whoever provides these services should receive fair remuneration for providing them.
* Sponsors do not get full value unless their data is used appropriately; sometimes this requires quality management, dissemination and branding efforts that can best be achieved by charging fees to users.
* Often, targeted end-users cannot use the data without additional processing (analysis, apps etc.). If anyone has access to the data, none may have an incentive to invest in the processing required to make data useful (typical examples include biological, medical, and environmental data).
* There is no control over the secondary use (aggregation) of open data.


Relation to other open activities

The goals of the Open Data movement are similar to those of other "Open" movements.
* Open access is concerned with making scholarly publications freely available on the internet. In some cases, these articles include open datasets as well.
* Open specifications are documents describing file types or protocols, where the documents are openly licensed. These specifications are primarily meant to improve how different software handles the same file types or protocols, although monopolists forced by law to publish open specifications might make this more difficult.
* Open content is concerned with making resources aimed at a human audience (such as prose, photos, or videos) freely available.
* Open knowledge. Open Knowledge International argues for openness in a range of issues including, but not limited to, those of open data. It covers (a) data, whether scientific, historical, geographic or otherwise; (b) content such as music, films, and books; and (c) government and other administrative information. Open data is included within the scope of the Open Knowledge Definition, which is alluded to in Science Commons' Protocol for Implementing Open Access Data.
* Open notebook science refers to the application of the Open Data concept to as much of the scientific process as possible, including failed experiments and raw experimental data.
* Open-source software is concerned with the open-source licenses under which computer programs can be distributed and is not normally concerned primarily with data.
* Open educational resources are freely accessible, openly licensed documents and media that are useful for teaching, learning, and assessing as well as for research purposes.
* Open research / open science / open science data (linked open science) means an approach to open and interconnect scientific assets like data, methods and tools with linked data techniques to enable transparent, reproducible and interdisciplinary research.
* Open-GLAM (Galleries, Libraries, Archives, and Museums) is an initiative and network that supports exchange and collaboration between cultural institutions that support open access to their digitized collections. The GLAM-Wiki Initiative helps cultural institutions share their openly licensed resources with the world through collaborative projects with experienced Wikipedia editors. Open Heritage Data is associated with Open-GLAM, as openly licensed data in the heritage sector is now frequently used in research, publishing, and programming, particularly in the Digital Humanities.


Open Data as commons


Ideas and definitions

Formally, the definitions of both Open Data and commons revolve around the concept of shared resources with a low barrier to access. Substantively, digital commons include Open Data in that they include resources maintained online, such as data. Looking at the operational principles of Open Data, one can see the overlap between Open Data and (digital) commons in practice. Principles of Open Data sometimes differ depending on the type of data under scrutiny. Nonetheless, they overlap to some extent, and their key rationale is the lack of barriers to the re-use of data(sets). Regardless of their origin, principles across types of Open Data hint at the key elements of the definition of commons. These include accessibility, re-use, findability, and non-proprietary status. Additionally, although to a lesser extent, the threats and opportunities associated with Open Data and commons are similar. In short, they revolve around the (risks and) benefits associated with the (uncontrolled) use of common resources by a large variety of actors.


The System

Both commons and Open Data can be defined by the features of the resources that fit under these concepts, but they can also be defined by the characteristics of the systems their advocates push for. Governance is a focus for both Open Data and commons scholars. The key elements that characterize commons and Open Data are their differences from (and perhaps opposition to) the dominant market logics as shaped by capitalism. Perhaps it is this feature that emerges in the recent surge of the concept of commons as related to a more social look at digital technologies, in the specific forms of digital commons and, especially, data commons.


Real-life case

An example of the relationship between Open Data and commons, and of how their governance can potentially disrupt the market logic otherwise dominating big data, is a project conducted by Human Ecosystem Relazioni in Bologna (Italy). See: https://www.he-r.it/wp-content/uploads/2017/01/HUB-report-impaginato_v1_small.pdf. The project aimed at extrapolating and identifying online social relations surrounding "collaboration" in Bologna. Data was collected from social networks and online platforms for citizen collaboration, and was then analyzed for content, meaning, location, timeframe, and other variables. Overall, online social relations for collaboration were analyzed based on network theory. The resulting datasets have been made available online as Open Data (aggregated and anonymized); nonetheless, individuals can reclaim all their data. This has been done with the idea of making data into a commons. The project exemplifies the relationship between Open Data and commons, and how they can disrupt the market logic driving big data use, in two ways. First, it shows how such projects, following to some extent the rationale of Open Data, can trigger the creation of effective data commons. The project itself offered different types of support to social network platform users to have content removed. Second, opening data regarding online social network interactions has the potential to significantly reduce the monopolistic power of social network platforms over those data.


Funders' mandates

Several funding bodies that mandate Open Access also mandate Open Data. A good expression of requirements (truncated in places) is given by the Canadian Institutes of Health Research (CIHR):
* to deposit bioinformatics, atomic and molecular coordinate data, and experimental data into the appropriate public database immediately upon publication of research results.
* to retain original data sets for a minimum of five years after the grant. This applies to all data, whether published or not.

Other bodies active in promoting the deposition of data as well as full text include the Wellcome Trust. An academic paper published in 2013 advocated that Horizon 2020 (the science funding mechanism of the EU) should mandate that funded projects hand in their databases as "deliverables" at the end of the project, so that they can be checked for third-party usability and then shared.


Non-open data

Several mechanisms restrict access to or reuse of data (and several reasons for doing this are given above). They include:
* making data available for a charge;
* compilation in databases or websites to which only registered members or customers can have access;
* use of a
proprietary {{Short pages monitor